Keras Multi-layer Perceptron

Dataset

A random n-class classification dataset can be generated using sklearn.datasets.make_classification. Here, we generate a dataset with two features and 1000 instances. The dataset is generated for binary classification, i.e., with two classes.

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from num2words import num2words

n_features = 2
n_classes = 2
X, y = make_classification(n_samples=int((n_classes - 1) * 1e3), n_features=n_features, n_redundant=0,
                           n_classes=n_classes, n_informative=2, random_state=1, n_clusters_per_class=1)
Labels_dict = dict(zip(list(np.unique(y)), [num2words(x).title() for x in np.unique(y)]))

Data = pd.DataFrame(data = X, columns = ['Feature %i' % (i+1) for i in range(n_features)])
Target = 'Outcome Variable'
Data[Target] = y
display(Data)

from HD_DeepLearning import Plot_Data
    
# mycmap = LinearSegmentedColormap.from_list('mycmap', ['OrangeRed', 'RoyalBlue'])    
PD = dict(BP = .5, alpha=.7, bg_alpha = 0.25, grid = True, cricle_size = 50,
          FigSize = 7, h=0.02, pad=1, ColorMap =  'bwr',
          Labels = list(Labels_dict.values()))

Plot_Data(X, y, PD = PD, Labels_dict = Labels_dict, ax = None)
     Feature 1  Feature 2  Outcome Variable
0     1.536830  -1.398694                 1
1     1.369176  -0.637344                 1
2     0.502318  -0.459105                 1
3     1.833193  -1.298082                 1
4     1.042356   1.121529                 0
..         ...        ...               ...
995   0.535224   0.435245                 1
996   1.069692  -0.129909                 1
997   1.820267  -2.957167                 1
998   1.004999   0.936290                 0
999   1.462110   1.144978                 0

1000 rows × 3 columns

Train and Test Sets

In [2]:
Pull = [.01 for x in range((len(Labels_dict)-1))]
Pull.append(.1)

import plotly.express as px
from HD_DeepLearning import DatasetTargetDist
PD = dict(PieColors = px.colors.sequential.Rainbow[0:-1:3], TableColors = ['Navy','White'], hole = .4,
          column_widths=[0.6, 0.4], textfont = 14, height = 400, tablecolumnwidth = [0.25, 0.15, 0.15],
          pull = Pull, legend_title = Target, title_x = 0.5, title_y = .9, pie_legend = [0.01, 0.01])
del Pull
DatasetTargetDist(Data, Target, Labels_dict, PD, orientation= 'columns')

StratifiedShuffleSplit is a merge of StratifiedKFold and ShuffleSplit: it returns stratified randomized splits, where each split contains approximately the same percentage of samples of each target class as the complete set.

In [3]:
from sklearn.model_selection import StratifiedShuffleSplit

Test_Size = 0.3
sss = StratifiedShuffleSplit(n_splits=1, test_size=Test_Size, random_state=42)
for train_index, test_index in sss.split(X, y):
    # X (split indices are positional, so use iloc for a DataFrame)
    if isinstance(X, pd.DataFrame):
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    else:
        X_train, X_test = X[train_index], X[test_index]
    # y
    if isinstance(y, pd.Series):
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    else:
        y_train, y_test = y[train_index], y[test_index]
del sss

from HD_DeepLearning import Train_Test_Dist    
PD.update(dict(column_widths=[0.3, 0.3, 0.3], tablecolumnwidth = [0.2, 0.4], height = 550, legend_title = Target))

Train_Test_Dist(X_train, y_train, X_test, y_test, PD, Labels_dict)
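Stratification can be verified directly by comparing class counts in the train and test subsets; a minimal sketch on a synthetic imbalanced label vector (the 70/30 toy labels here are illustrative, not the dataset above):

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# A 70/30 imbalanced binary label vector
y = np.array([0] * 70 + [1] * 30)
X = np.arange(len(y)).reshape(-1, 1)  # dummy features; stratification only uses y

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.3, random_state=42)
train_idx, test_idx = next(sss.split(X, y))

# Both subsets keep the 70/30 class ratio of the full set
print(np.bincount(y[train_idx]))  # [49 21]
print(np.bincount(y[test_idx]))   # [21  9]
```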

Modeling: Multi-layer Perceptron (MLP) for Binary classification

A multi-layer perceptron (MLP) is a class of feedforward artificial neural network (ANN). scikit-learn.org has a well-written article on MLPs, and interested readers are encouraged to read it.

In this article, an MLP for binary classification using Keras is presented. We define our model using the Sequential class. Moreover, we use the rectified linear unit (ReLU) as the activation function in the hidden layers; an activation function allows complex relationships in the data to be learned. For the last layer, we use the sigmoid function, which maps any real input to a value between 0 and 1 that can be read as a class probability.
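Both activation functions are simple elementwise maps; a minimal NumPy sketch (the helper names here are illustrative, not part of the model code below):

```python
import numpy as np

def relu(z):
    # ReLU: passes positive inputs through unchanged, zeroes out negatives
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid: squashes any real input into (0, 1), usable as a probability
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))       # [0. 0. 2.]
print(sigmoid(0.0))  # 0.5
```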

In [4]:
import tensorflow as tf

model = tf.keras.Sequential(name = 'Binary_MLP')
model.add(tf.keras.layers.Dense(64, input_dim = X.shape[1], activation='relu', name='Layer1'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(64, activation='relu', name='Layer2'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(1, activation='sigmoid', name='Layer3'))
model.summary()
tf.keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, expand_nested = True, rankdir = 'LR')
Model: "Binary_MLP"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 Layer1 (Dense)              (None, 64)                192       
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 Layer2 (Dense)              (None, 64)                4160      
                                                                 
 dropout_1 (Dropout)         (None, 64)                0         
                                                                 
 Layer3 (Dense)              (None, 1)                 65        
                                                                 
=================================================================
Total params: 4,417
Trainable params: 4,417
Non-trainable params: 0
_________________________________________________________________
Out[4]:
In [5]:
# Number of iterations
IT = int(5e2)+1

model.compile(optimizer='sgd', loss='mse', metrics=['accuracy', tf.keras.metrics.Recall()])

# Train model
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs= IT, batch_size=128, verbose = 0)
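The model above is compiled with MSE loss; binary cross-entropy is the more conventional choice for a sigmoid output, because it penalizes confident mistakes much more heavily. A NumPy sketch comparing the two losses on a single example (the helper functions are illustrative, not Keras API):

```python
import numpy as np

def mse(y_true, p):
    # Squared error between label and predicted probability
    return (y_true - p) ** 2

def binary_crossentropy(y_true, p):
    # Negative log-likelihood of the true label under the predicted probability
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Confident wrong prediction: true label 1, predicted probability 0.01
print(mse(1, 0.01))                  # 0.9801 -- bounded by 1
print(binary_crossentropy(1, 0.01))  # ~4.61  -- grows without bound as p -> 0
```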
In [6]:
def Search_List(Key, List): return [s for s in List if Key in s]

Metrics_Names = {'loss':'Loss', 'accuracy':'Accuracy', 'mae':'MAE', 'mse':'MSE', 'recall': 'Recall'}

from HD_DeepLearning import history2Table

Validation_Table = Search_List('val_',history.history.keys()) 
Train_Table = list(set( history.history.keys()) - set(Validation_Table))
Validation_Table = pd.DataFrame(np.array([history.history[x] for x in Validation_Table]).T, columns = Validation_Table)
Train_Table = pd.DataFrame(np.array([history.history[x] for x in Train_Table]).T, columns = Train_Table)
Validation_Table.columns = [x.replace('val_','') for x in Validation_Table.columns]

Train_Table = history2Table(Train_Table, Metrics_Names)
Validation_Table = history2Table(Validation_Table, Metrics_Names)

# Train Set Score
score = model.evaluate(X_train, y_train, batch_size=128, verbose = 0)
score = pd.DataFrame(score, index = model.metrics_names).T
score.index = ['Train Set Score']

# Validation Set Score
Temp = model.evaluate(X_test, y_test, batch_size=128, verbose = 0) 
Temp = pd.DataFrame(Temp, index = model.metrics_names).T
Temp.index = ['Validation Set Score']
score = pd.concat([score, Temp])
score.rename(columns= Metrics_Names, inplace = True)
score = score.reindex(sorted(score.columns), axis=1)
display(score.style.format(precision=4))
                      Accuracy    Loss  Recall
Train Set Score         0.9029  0.0620  0.8771
Validation Set Score    0.8767  0.0720  0.8533
In [7]:
from HD_DeepLearning import Plot_history
PD = dict(row_heights = [0.4, 0.6], lw = 1.5, font_size=12, height = 700, yLim = 1,
          th_line_color = 'Navy', th_fill_color='darkslategray', table_columnwidth = [0.4, 0.4, 0.4, 0.4],
          tc_line_color = 'Navy', tc_fill_color = None, title_x = 0.46, title_y = 0.94, tb_cell_heigh = 20,
          Number_Format = '%.4e')

Plot_history(Train_Table, PD, Title = 'Train Set', Table_Rows = 10, Colors = ['RoyalBlue', 'DarkGreen', 'Red'])
Plot_history(Validation_Table, PD, Title = 'Validation Set', Table_Rows = 10, Colors = ['RoyalBlue', 'DarkGreen', 'Red'])
In [8]:
from HD_DeepLearning import Plot_Classification
import matplotlib.pyplot as plt

PD = dict(BP = .5, alpha=.7, bg_alpha = 0.15, grid = False, cricle_size = 50,
          FigSize = 7, h=0.02, pad=1, ColorMap =  'bwr', Labels = list(Labels_dict.values()))

fig, ax = plt.subplots(1, 2, figsize=(16, 7))
# Train Set
Plot_Classification(model, X_train, y_train, PD = PD, ax = ax[0])
_ = ax[0].set_title('Train Set', fontsize = 16, weight='bold')
# Test Set
Plot_Classification(model, X_test, y_test, PD = PD, ax = ax[1])
_ = ax[1].set_title('Test Set', fontsize = 16, weight='bold')

Confusion Matrix

The confusion matrix allows the performance of an algorithm to be visualized. Note that, due to the size of the dataset, we do not provide a cross-validation evaluation here; in general, that type of evaluation is preferred.
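For reference, a cross-validation loop would be built on stratified folds, each preserving the 50/50 class balance of this dataset. A minimal sketch of the fold construction (the synthetic labels and fold count are illustrative; in a full run, the Keras model would be rebuilt and fit inside the loop):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([0] * 500 + [1] * 500)            # balanced binary labels, as in this dataset
X = np.random.RandomState(0).randn(len(y), 2)  # two features, as in this dataset

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Each validation fold holds 200 samples with the same 50/50 class balance
    print(f'Fold {fold}: val size = {len(val_idx)}, class counts = {np.bincount(y[val_idx])}')
    # A full cross-validation run would rebuild the model here, fit it on
    # (X[train_idx], y[train_idx]), and score it on (X[val_idx], y[val_idx]).
```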

In [9]:
from sklearn import metrics

# Train
y_pred = np.round(model.predict(X_train))
Reports_Train = pd.DataFrame(metrics.classification_report(y_train, y_pred, target_names=list(Labels_dict.values()),
                                                           output_dict=True)).T
CM_Train = metrics.confusion_matrix(y_train, y_pred)
# Test
y_pred = np.round(model.predict(X_test))
Reports_Test = pd.DataFrame(metrics.classification_report(y_test, y_pred, target_names=list(Labels_dict.values()),
                                                          output_dict=True)).T
CM_Test = metrics.confusion_matrix(y_test, y_pred)

Reports_Train = Reports_Train.reset_index().rename(columns ={'index': 'Train Set'})
Reports_Test = Reports_Test.reset_index().rename(columns ={'index': 'Test Set'})
                                                 
display(Reports_Train.style.hide(axis='index').set_properties(**{'background-color': 'HoneyDew', 'color': 'Black'}).\
        set_properties(subset=['Train Set'], **{'background-color': 'SeaGreen', 'color': 'White'}))
display(Reports_Test.style.hide(axis='index').set_properties(**{'background-color': 'Azure', 'color': 'Black'}).\
        set_properties(subset=['Test Set'], **{'background-color': 'RoyalBlue', 'color': 'White'}))

from HD_DeepLearning import Confusion_Mat
PD = dict(FS = (10, 6), annot_kws = 14, shrink = .6, Labels = list(Labels_dict.values()))
Confusion_Mat(CM_Train, CM_Test, PD = PD, n_splits = None)
Train Set     precision    recall  f1-score   support
Zero           0.883152  0.928571  0.905292       350
One            0.924699  0.877143  0.900293       350
accuracy       0.902857  0.902857  0.902857  0.902857
macro avg      0.903925  0.902857  0.902793       700
weighted avg   0.903925  0.902857  0.902793       700

Test Set      precision    recall  f1-score   support
Zero           0.859873  0.900000  0.879479       150
One            0.895105  0.853333  0.873720       150
accuracy       0.876667  0.876667  0.876667  0.876667
macro avg      0.877489  0.876667  0.876599       300
weighted avg   0.877489  0.876667  0.876599       300